Anwesh Tuladhar, University of South Florida, atuladhar@mail.usf.edu
PRIMARY
Sulav Malla, University of South Florida, sulavmalla@mail.usf.edu
Ghulam Jilani Quadri, University of South Florida, ghulamjilani@mail.usf.edu
Dr. Paul Rosen, University of South Florida, Tampa FL, prosen@usf.edu
Student Team: YES
Processing 3
Tableau
Apache Spark
Excel
Approximately how many hours were spent working on this submission in total? 80 Hours
May we post your submission in the
Visual Analytics Benchmark Repository after VAST Challenge 2017 is complete? YES
Video
http://eng.usf.edu/~sulavmalla/sulav_malla_files/vast2017/USF-Tuladhar-MC1-Video.mpg
OR
http://eng.usf.edu/~sulavmalla/sulav_malla_files/vast2017/USF-Tuladhar-MC1-Video.wmv
Questions
1 – “Patterns
of Life” analyses depend on recognizing repeating patterns of activities by
individuals or groups. Describe up to six daily patterns of life by vehicles
traveling through and within the park. Characterize the patterns by describing
the kinds of vehicles participating, their spatial activities (where do they
go?), their temporal activities (when does the pattern happen?), and provide a
hypothesis of what the pattern represents (for example, if I drove to a coffee
house every morning, but did not stay for long, you might hypothesize I’m
getting coffee “to-go”). Please limit your answer to six images and 500 words.
For the “Patterns of Life” analyses,
first we enrich the given data in two steps:
The map.
We developed a tool in Processing to represent the given map as
a weighted graph. First, we scan the map to find all the sensor locations,
which become the nodes of the graph. Then we perform a modified Depth First
Search to find all the paths between the nodes, which become the edges.
Each edge is weighted by the distance between its two nodes. Using this graph,
we can plot any path on the map. We also export data from this graph as a CSV.
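The map-to-graph step can be illustrated with a minimal Python sketch. It is not the authors' Processing tool: the toy grid, the single-letter sensor names, the function names and the use of cell counts as distances are all assumptions made for illustration. The sketch also keeps only the shortest direct connection between each pair of sensors, whereas the tool described above records all paths.

```python
import csv
import io
from typing import Dict, Iterator, List, Tuple

Cell = Tuple[int, int]

# Hypothetical toy map: '#' is road, letters are sensors, '.' is off-road.
TOY_MAP: List[str] = [
    "A##B",
    "#..#",
    "C###",
]


def find_sensors(grid: List[str]) -> Dict[str, Cell]:
    """Scan the map and return the location of every sensor (graph nodes)."""
    return {
        ch: (r, c)
        for r, row in enumerate(grid)
        for c, ch in enumerate(row)
        if ch.isalpha()
    }


def road_neighbors(grid: List[str], cell: Cell) -> Iterator[Cell]:
    """Yield the 4-connected neighbours of a cell that are road or sensor."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[nr]) and grid[nr][nc] != ".":
            yield (nr, nc)


def sensor_edges(grid: List[str]) -> Dict[Tuple[str, str], int]:
    """Depth-first search from each sensor along road cells.

    A branch stops as soon as it reaches another sensor, so every edge links
    two sensors with no sensor in between. The weight is the number of cells
    travelled (an assumed stand-in for real distance).
    """
    edges: Dict[Tuple[str, str], int] = {}
    for name, start in find_sensors(grid).items():
        stack = [(start, 0, frozenset([start]))]
        while stack:
            cell, dist, seen = stack.pop()
            for nxt in road_neighbors(grid, cell):
                if nxt in seen:
                    continue
                ch = grid[nxt[0]][nxt[1]]
                if ch.isalpha():
                    key = (name, ch)
                    edges[key] = min(dist + 1, edges.get(key, dist + 1))
                else:
                    stack.append((nxt, dist + 1, seen | {nxt}))
    return edges


def edges_to_csv(edges: Dict[Tuple[str, str], int]) -> str:
    """Serialise the edge list as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["from", "to", "distance"])
    for (a, b), d in sorted(edges.items()):
        writer.writerow([a, b, d])
    return buf.getvalue()
```

Calling `edges_to_csv(sensor_edges(TOY_MAP))` returns one row per directed sensor pair, which is the kind of table a downstream aggregation step could join against.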
Sensor data.
We developed another tool in Spark to aggregate the sensor data and
combine it with the graph data. We group the sensor data by “car-id” to trace
the path followed by each car in a day. Each such record represents a trip for a car.
To prevent dangling trips when a trip spans two days, we apply a heuristic that a
trip can end only at a camping site, the ranger-base or an entrance. We use the
graph data to calculate the distance travelled, time taken and average speed for
each trip. When multiple paths exist between two sensors, we choose the path for
which the speed of travel is closest to the speed limit. We also record the start
gate, end gate and day of week for each trip. We noticed that many trips follow the
same path in forward and reverse directions, so we calculate the hash of the
forward and reverse paths for easier grouping during visualization.
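Two of the heuristics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Spark code: the function names, the `(path, distance)` tuple layout, the units and the choice of MD5 are hypothetical, and the speed limit is passed in as a parameter because its value is not given here. The key function shows one possible way to make a path and its reverse group together, by hashing whichever of the two orderings sorts first.

```python
import hashlib
from typing import List, Sequence, Tuple

# A candidate is (list of sensor names along the path, distance in miles).
Candidate = Tuple[List[str], float]


def pick_path(candidates: Sequence[Candidate], hours: float, speed_limit: float) -> Candidate:
    """Return the candidate whose implied average speed is closest to the limit.

    `hours` is the observed travel time between the two sensor readings.
    """
    if hours <= 0:
        raise ValueError("travel time must be positive")
    return min(candidates, key=lambda cand: abs(cand[1] / hours - speed_limit))


def path_key(path: Sequence[str]) -> str:
    """Direction-independent key: a path and its reverse hash to the same value."""
    forward = ":".join(path)
    backward = ":".join(reversed(path))
    canonical = min(forward, backward)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

For example, with an assumed limit of 25 mph and a 30-minute gap between readings, a 14-mile candidate (28 mph) would be preferred over a 10-mile candidate (20 mph). Grouping trips by `path_key` then counts both directions of the same route together, while a separate direction flag can still distinguish them.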
Top 6 Patterns.
Using Tableau and the enriched data,
we plot the most frequently used paths which enter and exit the preserve within
a day. From Figure 1, we can see that the top 10 paths have a much higher count
than the rest. Figure 1 also breaks the counts down by vehicle type and shows
that all types of vehicles follow these paths.
Figure 1. Top 15 daily patterns through the preserve
The top 6 paths can be seen as a
subway map in figure 2.
Figure 2. Top 6 paths as a subway map
In figure 3, we can see that the
vehicles travel in both directions equally.
Figure 3. Top 6 paths separated by direction of travel
In figure 4, we look at the
distribution of the path usage over different days of the week. In figure 5, we
see the path usage over 24 hours in 30 minute intervals. From these, we can see
that these paths are travelled regularly in both directions, throughout the
week and in all hours of the day.
Figure 4. Top 6 paths by day of week
Figure 5. Start time distribution for top 6 paths.
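The groupings behind the day-of-week and start-time charts can be reproduced outside Tableau with a short Python sketch. The function names and the label formats are assumptions for illustration; the only idea taken from the text is that trip start times are bucketed by day of week and by 30-minute interval.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_of_week(ts: datetime) -> str:
    """Name of the weekday on which a trip started."""
    return DAY_NAMES[ts.weekday()]


def half_hour_bin(ts: datetime) -> str:
    """Label of the 30-minute interval containing ts, e.g. '14:30'."""
    minute = 0 if ts.minute < 30 else 30
    return f"{ts.hour:02d}:{minute:02d}"


def start_time_histogram(starts: Iterable[datetime]) -> Counter:
    """Count trip start times per 30-minute interval."""
    return Counter(half_hour_bin(ts) for ts in starts)
```

A path that is used at all hours, as described above, would show non-zero counts in most of the 48 intervals rather than a few isolated peaks.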
In figure 6, we see that all vehicles
have an average speed of 35-36 mph. From this, we can assume that the cars are
not stopping anywhere on these paths.
Figure 6. Average speed of the vehicles following the top 6 paths
Conclusion.
From these figures, we conclude that the top daily patterns are through
traffic: vehicles enter the preserve, drive through without stopping, and exit.
2 – Patterns
of Life analyses may also depend on understanding what patterns appear over
longer periods of time (in this case, over multiple days). Describe up to six
patterns of life that occur over multiple days (including across the entire
data set) by vehicles traveling through and within the park. Characterize the
patterns by describing the kinds of vehicles participating, their spatial
activities (where do they go?), their temporal activities (when does the
pattern happen?), and provide a hypothesis of what the pattern represents (for
example, many vehicles showing up at the same location each Saturday at the
same time may suggest some activity occurring there each Saturday). Please
limit your answer to six images and 500 words.
For patterns spanning multiple days, we
further enrich the data from the single-day analysis by combining the trips that
have not yet exited the preserve and adding daily record information, the end
destination for each day spent in the preserve, and the total number of days
spent in the preserve.
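The multi-day enrichment can be sketched as a merge of per-day trip records into one stay per visit. The record layout below (`DailyRecord`, `Stay`, the `exited` flag) is hypothetical and chosen for illustration; the actual Spark job may be organised differently.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Iterable, List


@dataclass
class DailyRecord:
    car_id: str
    day: date
    end_location: str  # last sensor that saw the car on this day
    exited: bool       # True if the car left the preserve on this day


@dataclass
class Stay:
    car_id: str
    first_day: date
    last_day: date
    # End destination for each recorded day on which the car stayed overnight.
    daily_destinations: List[str] = field(default_factory=list)

    @property
    def total_days(self) -> int:
        return (self.last_day - self.first_day).days + 1


def merge_daily_records(records: Iterable[DailyRecord]) -> List[Stay]:
    """Chain the daily records of each car until the car exits the preserve."""
    stays: List[Stay] = []
    open_stays: Dict[str, Stay] = {}
    for rec in sorted(records, key=lambda r: (r.car_id, r.day)):
        stay = open_stays.get(rec.car_id)
        if stay is None:
            # No trip in progress for this car: start a new stay.
            stay = Stay(rec.car_id, rec.day, rec.day)
            open_stays[rec.car_id] = stay
            stays.append(stay)
        stay.last_day = rec.day
        if rec.exited:
            del open_stays[rec.car_id]
        else:
            stay.daily_destinations.append(rec.end_location)
    return stays
```

Because `total_days` is computed from the first and last dates, days on which a parked car triggers no sensor are still counted. A stay that is never closed by an exit record remains open, which is how a vehicle that has not yet left the preserve would show up.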
In figure 1, we see the top paths spanning
multiple days. The top 6 patterns have a frequency of over 85. The top 5 paths
have camping5 as the camping destination.
Figure 1. Top paths spanning multiple days
In figure 2, we
show the subway map plots for the top 6 multiple day travel patterns.
Figure 2. Top 6 paths spanning multiple days as a subway map
Figure 1 suggests
that camping5 is the most popular camping site, and we confirm that from figure
3, where we see the top camping destinations. We also see that campers have
vehicles of type 1, 2 and 3. From figure 4, we see that camping5 is the most
popular camp site throughout the year.
Figure 3. Popular camp sites
Figure 4. Popular camp sites by month
In figure 5, we
look at the hours of the day when campers enter the preserve. We see that, across
all camp sites, vehicles start entering at 6 am and have all entered the park by
5:30 pm. This suggests that the campers require some sort of permit from the
preserve, and that the office is only open from 6 am to 5:30 pm.
Figure 5. Start time distribution for camp sites
Camping1 seems to
be the least popular camping site. However, many cars (155 in total) still pass
through camping1 and spend time there. Figure 6 shows details of trips
travelling through camping1 and spending time there. This suggests that camping1
might be popular for daytime hiking.
Figure 6. a) All trips passing through camping1. b) Trips passing
camping1 and finally camping elsewhere
3 – Unusual
patterns may be patterns of activity that changes from an established pattern,
or are just difficult to explain from what you know of a situation. Describe up
to six unusual patterns (either single day or multiple days) and highlight why
you find them unusual. Please limit your answer to six images and 500 words.
Pattern 1.
In figure 1, we see that path 6 is
the least popular in the month of May but the most popular path in the month of
July. This change is unusual because this path includes the section
general-gate1:ranger-stop2:ranger-stop0:general-gate2. This section is special
because the preserve is divided into two sides, which are connected by only 2
sections. One of them is this section, and the other one
(gate6:ranger-stop6:gate5) is not open to the general public.
Figure 1. Change in popularity of top single day path usage
Pattern 2.
In figure 2, we break down the top
multiple day paths from question 2 by month. We see that usage of the most popular
path to camping5 drops suddenly from July to August. This may indicate that
something changed in that area during that period.
Figure 2. Drop in multiple day path usage from July to August
Pattern 3.
In figure 3, we see that on July 10,
2015 six cars of type 1 travel the same path from entrance1 to ranger-stop1
around the same time. This activity raises suspicion as none of the cars
trigger the sensor on gate2 although it must be passed while going from
entrance1 to ranger-stop1. Also, access to gate2 is only allowed for park
rangers with car type 2P.
Figure 3. Illegal car type 1.
Pattern 4.
In figure 4a, we plot all the paths
passing a “gate”. Only ranger vehicles can pass these gates, but we see that 23
cars of type 4 also pass them. In figure 4b, we see that these cars always take
this same path on Tuesdays and Thursdays. From figure 4c, we see
that these paths are travelled throughout the year and around the same time,
from 2 pm to 4:30 pm, suggesting that something suspicious is happening on this
route on those days.
Figure 4. a) All the paths which include a “gate”. Color indicates
the car type. b) The day of week when cars of type 4 travel this path. c) The
time of travel breakdown by months.
Pattern 5.
We found that a
large number of people spend more than 10 days (up to 32 days) in the
camp sites (Figure 5.a, 5.b). We also established in question 2 that camping1 is
the least popular camping site. Even there, we see from figure 5.c that some people spend an extended amount of time
(up to 13 days) at this site. Although extended camping is allowed in the
preserve, we suspect these people might have a different agenda for staying for
extended periods of time in the camp sites.
Figure 5. a) Number of people spending more than 10 days in the
park camp sites. b) Breakdown by number of days spent in the park. c) Days
spent in Camping1
Pattern 6.
We found a person
who has spent 350 days in the park and still hasn’t left. In figure 6, we see
the number of days this person spent in each camp site, with the start date of
each stay shown on top. We find this highly unusual. Also, even after spending
almost a year in the preserve, this person is yet to spend a day in camping1,
which makes the statistics from figure 4 even more suspicious.
Figure 6. Days spent in each camp site by the person who spent
almost a year in the park.
4 – What are the top 3 patterns you discovered that you suspect
could be most impactful to bird life in the nature preserve? (Short text
answer)
The top 3 patterns that we suspect are causing the most impact on bird life in the
preserve are:
a. The daily through traffic involves all the entrances to the preserve and
disturbs the wildlife 24 x 7 throughout the preserve. This must be
having a significant impact on the bird life in the preserve.
b. The unusual pattern that occurs on Tuesdays and Thursdays throughout the year is
also highly suspicious. The activities these vehicles are performing might also be
hurting bird life in the preserve.
c. The campers staying for an extended number of days in the park might also be
disturbing the natural habitat of the birds in those areas.